
Creators/Authors contains: "Gleich, David F."


  1. Abstract

    Current complex prediction models are the result of fitting deep neural networks, graph convolutional networks or transducers to a set of training data. A key challenge with these models is that they are highly parameterized, which makes describing and interpreting the prediction strategies difficult. We use topological data analysis to transform these complex prediction models into a simplified topological view of the prediction landscape. The result is a map of the predictions that enables inspection of the model results with more specificity than dimensionality-reduction methods such as t-SNE and UMAP. The methods scale up to large datasets across different domains. We present a case study of a transformer-based model previously designed to predict expression levels of a piece of DNA in thousands of genomic tracks. When the model is used to study mutations in the BRCA1 gene, our topological analysis shows that it is sensitive to the location of a mutation and the exon structure of BRCA1 in ways that cannot be found with tools based on dimensionality reduction. Moreover, the topological framework offers multiple ways to inspect results, including an error estimate that is more accurate than model uncertainty. Further studies show how these ideas produce useful results in graph-based learning and image classification.

     
    Free, publicly-accessible full text available November 17, 2024
  2. Free, publicly-accessible full text available September 30, 2024
  3. null (Ed.)
  4. null (Ed.)
  5. Estrada, Ernesto (Ed.)
    Abstract

    Preferential attachment (PA) models are a common class of graph models that have been used to explain why power-law distributions appear in the degree sequences of real network data. Real-world networks also commonly have non-trivial clustering coefficients, due to an abundance of triangles, as well as power laws in their eigenvalue spectra. Although there are triangle PA models, and eigenvalue power laws have been shown for specific PA constructions, there are no results showing that any existing construction has both. In this article, we present a specific Triangle Generalized Preferential Attachment Model that, by construction, has non-trivial clustering. We further prove that this model has a power law in both the degree distribution and the eigenvalue spectrum.
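To make the idea of building triangles into preferential attachment concrete, here is a minimal Python sketch. It is not the Triangle Generalized Preferential Attachment Model from the article, whose exact rules the abstract does not give. It assumes a simplified, hypothetical growth rule: each new node attaches to one existing node chosen with probability proportional to its degree, then to one uniformly chosen neighbor of that node, which closes a triangle. The function names `triangle_pa_graph` and `count_triangles` are illustrative only.

```python
import random
from collections import defaultdict


def triangle_pa_graph(n, seed=0):
    """Grow an n-node graph with a simplified (assumed) triangle PA rule."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list selects a node with probability proportional to degree.
    endpoints = []

    def add_edge(u, v):
        adj[u].add(v)
        adj[v].add(u)
        endpoints.extend((u, v))

    # Seed graph: a single triangle on nodes 0, 1, 2.
    add_edge(0, 1)
    add_edge(1, 2)
    add_edge(0, 2)

    for new in range(3, n):
        target = rng.choice(endpoints)             # degree-proportional choice
        neighbor = rng.choice(sorted(adj[target]))  # uniform neighbor of target
        add_edge(new, target)
        add_edge(new, neighbor)                     # closes triangle new-target-neighbor
    return adj


def count_triangles(adj):
    """Count each triangle once by requiring u < v < w."""
    total = 0
    for u in adj:
        for v in adj[u]:
            if v > u:
                total += sum(1 for w in adj[u] & adj[v] if w > v)
    return total


if __name__ == "__main__":
    g = triangle_pa_graph(1000, seed=1)
    degrees = sorted((len(nbrs) for nbrs in g.values()), reverse=True)
    print("largest degrees:", degrees[:5])
    print("triangles:", count_triangles(g))
```

Under this assumed rule every arriving node adds exactly two edges and exactly one triangle, so the triangle count grows linearly with the number of nodes, while the degree-proportional step is what drives a heavy-tailed degree sequence. Whether the eigenvalue spectrum also follows a power law is the kind of question the article answers for its own model; this sketch makes no such claim.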